Note: if you open this notebook in an IDE like RStudio, please make sure to set the working directory to the folder where this notebook is located. Otherwise, the code will not work as intended.

1 Purpose of this notebook

This notebook is intended to make you familiar with processing and analyzing smartphone survey data in R. It uses data exported from a demo survey on Emotionality and Social Interaction, which was conducted with the GESIS AppKit 🐿️. The data is available in the data folder of this repository.

The main goal of this notebook is to make the user familiar with importing, cleaning and inspecting the data. It also includes some basic graphical representations of the data.

The notebook is not intended to be a comprehensive guide to smartphone survey data analysis, but rather a starting point for further exploration. For further statistical analysis of intensive longitudinal data, make sure to check out the two-day GESIS workshop by Dr. Lukas Otto.





2 Introduction

The demo study in this notebook is intended to replicate the findings of the study by Barrett et al. (2010) on emotionality and social interaction, which looks at the self-reported emotionality of individuals and differences between men and women. The study’s main finding:

We predicted and found that sex-related differences in emotion in global self-descriptions, but not in the averaged momentary ratings of emotion.

The study consisted of the following surveys:

  • An entry survey distributed to participants after they logged into the study on the AppKit app.
  • A momentary assessment survey that was sent to participants at 8 random times during the day for a period of 4 days.
  • An exit survey that was distributed to participants after the momentary assessment period.

You can find more information on the GESIS AppKit and request your own account on the GESIS AppKit website. Detailed documentation on how to use the AppKit is available there as well.





3 Data processing

3.1 Setup

In order to run the code in this notebook, you need to install and load the following packages:

# List of required packages
packages <- c("ggplot2",
              "anytime",
              "dplyr",
              "psych",
              "corrplot",
              "cli",
              "lubridate",
              "tidyverse",
              "ggpubr")

# Check if each package is installed and install if missing
for (package in packages) {
  if (!require(package, character.only = TRUE, quietly = TRUE)) {
    message(paste("Installing package:", package))
    install.packages(package, repos = "https://cloud.r-project.org/")
    library(package, character.only = TRUE)
  } else {
    message(paste("Package", package, "is already loaded"))
  }
}
## Warning: package 'anytime' was built under R version 4.3.3
## Warning: package 'psych' was built under R version 4.3.3
message("All required packages are now installed and loaded")

We will also load some convenience functions tailored to working with AppKit data. If you want to check out the code, you can find it in the appkit_helpers.R file.

source("appkit_helpers.R")

3.2 Importing and auto-formatting data

With the helper functions loaded, we can go ahead and use them to import the data.

The data are read from the data folder. The read_appkit_surveys() function automatically reads all .csv files in the specified folder and returns them as a list with one data frame per survey. Each data frame contains the metadata (e.g., participant ID, survey ID, etc.) and the actual survey responses.

surveys <- read_appkit_surveys('data', na_value = -1)
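
To give a rough idea of what a folder-based import involves, here is a minimal base-R sketch. It is not the code of read_appkit_surveys(); the function name read_csv_folder() and the demo files are made up for illustration, and the real helper evidently does more (for example, it attaches labels and converts answers to factors).

```r
# Hypothetical stand-in for a folder-based CSV import (NOT the AppKit helper).
# Reads every .csv file in `folder` and returns a named list of data frames.
read_csv_folder <- function(folder, na_value = -1) {
  files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
  surveys <- lapply(files, read.csv,
                    na.strings = c("NA", as.character(na_value)),
                    stringsAsFactors = FALSE)
  # Name each list element after its file (without the extension)
  names(surveys) <- tools::file_path_sans_ext(basename(files))
  surveys
}

# Demonstration with two made-up files in a temporary folder
demo_dir <- file.path(tempdir(), "appkit_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = c("a", "b"), age = c(30, -1)),
          file.path(demo_dir, "entry_survey.csv"), row.names = FALSE)
write.csv(data.frame(id = c("a", "a"), emo_happy = c(4, 5)),
          file.path(demo_dir, "momentary_assessment.csv"), row.names = FALSE)

demo_surveys <- read_csv_folder(demo_dir)
names(demo_surveys)
demo_surveys$entry_survey
```

In this sketch the value -1 is turned into NA while reading, which mirrors the role of the na_value argument in the call above.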

Let’s take a look at the surveys contained in the surveys list.

# Check the names of the surveys
names(surveys)
## [1] "momentary_assessment" "exit_survey"          "entry_survey"

As you can see, the surveys list contains three surveys. The first one contains the momentary assessment surveys for each participant, while the second and third ones are the exit and entry surveys (the post- and pre-survey), respectively.

To make things easier, let’s assign the surveys to their own objects:

entry <- surveys$entry_survey
ma <- surveys$momentary_assessment
exit <- surveys$exit_survey

Let’s inspect each of the surveys in turn. We will start with the entry survey.





4 Entry Survey

Let’s check the contained variables first.

glimpse(entry)
## Rows: 7
## Columns: 30
## $ personalParticipantCode <chr> "4wbtjyiq", "x6mk28vz", "odsv4l6h", "s4sm1bs4"…
## $ generalLoginCode        <chr> "EMOTIONALITY", "EMOTIONALITY", "EMOTIONALITY"…
## $ committed               <chr> "10-Mar-2025 09:49:56", "10-Mar-2025 12:06:17"…
## $ scheduled               <chr> "10-Mar-2025 09:27:54", "10-Mar-2025 12:03:49"…
## $ published               <chr> "07-Mar-2025 12:43:15", "07-Mar-2025 12:43:15"…
## $ osVersion               <chr> "iOS 18.3.1", "iOS 18.3.1", "iOS 18.3.1", "iOS…
## $ smartphoneType          <chr> "iPhone12,8", "iPhone12,8", "iPhone14,6", "iPh…
## $ aim01                   <fct> Almost never, Occasionally, Almost never, Almo…
## $ aim02                   <fct> NA, Almost always, Almost never, Almost never,…
## $ aim03                   <fct> Occasionally, Occasionally, Occasionally, Occa…
## $ aim04                   <fct> Almost always, Usually, Almost never, Occasion…
## $ aim05                   <fct> Usually, Almost always, Usually, Almost always…
## $ aim06                   <fct> Occasionally, Occasionally, Usually, Occasiona…
## $ aim07                   <fct> Usually, Usually, Almost always, Always, Alway…
## $ aim08                   <fct> Almost always, Almost always, Usually, Usually…
## $ aim09                   <fct> Usually, Usually, Usually, Always, Always, Alm…
## $ aim10                   <fct> NA, NA, Usually, Occasionally, Almost always, …
## $ aim11                   <fct> Occasionally, Occasionally, Almost never, Neve…
## $ aim12                   <fct> Almost never, Almost never, Occasionally, Almo…
## $ aim13                   <fct> Almost never, Almost never, Almost always, Occ…
## $ aim14                   <fct> Occasionally, Almost never, Usually, Almost ne…
## $ aim15                   <fct> Occasionally, Occasionally, Occasionally, Occa…
## $ aim16                   <fct> NA, NA, Almost never, Always, Always, Almost a…
## $ aim17                   <fct> Almost never, NA, Almost never, Almost never, …
## $ aim18                   <fct> Almost always, NA, Occasionally, Almost always…
## $ aim19                   <fct> NA, NA, Almost never, Always, Occasionally, Ne…
## $ aim20                   <fct> Occasionally, NA, Almost never, Almost never, …
## $ age                     <int> 62, 42, NA, 36, 21, 34, 39
## $ sex                     <fct> NA, NA, Male, Female, Male, Male, Male
## $ gender                  <fct> NA, Female, Male, Female, Male, Male, Male

We can see that the entry survey contains 7 rows and 30 columns. This indicates that 7 participants completed the entry survey. Of the 30 columns, 7 hold metadata and 23 hold the survey questions (the 20 AIM items plus age, sex and gender).

The first few variables contain metadata on the participant and their device:

  • personalParticipantCode: Used to identify the participant in the study.
  • generalLoginCode: Stores the general login code shared among participants to participate in the study.
  • committed: Timestamp indicating when the participant’s response was committed.
  • scheduled: Timestamp indicating when the survey was scheduled for the participant.
  • published: Timestamp indicating when the survey was set to published in the AppKit backend.
  • osVersion: Holds information about the participant’s operating system version.
  • smartphoneType: Describes the participant’s smartphone or device type.

We can use this metadata to get an overview of who participated in our study and on which devices.
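
The timestamp columns described above are stored as character strings, so they need to be converted before any date arithmetic. The following is a small sketch of one way to do this, using the first two scheduled and committed values from the glimpse() output. The function name parse_appkit_time() is hypothetical, and the time zone is an assumption because the export does not state one.

```r
# Convert "dd-Mon-yyyy HH:MM:SS" strings to POSIXct without relying on the
# session locale (month.abb always holds the English abbreviations).
# Assumption: all timestamps share one time zone, here set to UTC.
parse_appkit_time <- function(x, tz = "UTC") {
  month_num <- match(substr(x, 4, 6), month.abb)
  iso <- sprintf("%s-%02d-%s %s",
                 substr(x, 8, 11),   # year
                 month_num,          # month as number
                 substr(x, 1, 2),    # day
                 substr(x, 13, 20))  # time of day
  as.POSIXct(iso, tz = tz)
}

scheduled <- parse_appkit_time(c("10-Mar-2025 09:27:54", "10-Mar-2025 12:03:49"))
committed <- parse_appkit_time(c("10-Mar-2025 09:49:56", "10-Mar-2025 12:06:17"))

# Minutes between scheduling and committing a response
latency_min <- as.numeric(difftime(committed, scheduled, units = "mins"))
round(latency_min, 1)
```

With the packages loaded in the setup, lubridate::dmy_hms() is a shorter alternative, although its handling of month names depends on the locale.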


4.1 Devices and Operating Systems

table(entry$osVersion, entry$smartphoneType, useNA = "always")
##             
##              2107113SG CPH2305 iPhone12,8 iPhone14,4 iPhone14,6 <NA>
##   Android 14         1       0          0          0          0    0
##   Android 15         0       1          0          0          0    0
##   iOS 18.3.1         0       0          3          1          1    0
##   <NA>               0       0          0          0          0    0

Looks like we have 2 users on recent versions of Android and 5 iPhone users on iOS 18.3.1. Luckily, we have no hackers running Android on their iPhones in our sample.
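
If only the platform matters, the version number can be stripped from osVersion. The sketch below rebuilds the OS versions from the table above as a plain vector (os_version and platform are illustrative names) and keeps everything before the first space.

```r
# OS versions as counted in the table above: 5 x iOS, 2 x Android
os_version <- c(rep("iOS 18.3.1", 5), "Android 14", "Android 15")

# Drop everything from the first space onwards to keep the platform name
platform <- sub(" .*$", "", os_version)
table(platform)
```

Applied to the real data, the same sub() call could be used on entry$osVersion.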

We can, of course, also visualize this information. Let’s create a bar plot showing the number of participants per smartphone type and operating system version.

ggplot(entry, aes(x = smartphoneType)) +
  geom_bar(fill = "dodgerblue") +
  labs(title = "Smartphone Types",
       x = "Smartphone Type",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

ggplot(entry, aes(x = osVersion)) +
  geom_bar(fill = "orchid") +
  labs(title = "Distribution of Operating Systems",
       x = "OS Version",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

We can also check the distribution of operating systems across different smartphone types because people using the same smartphone do not necessarily use the same operating system.

ggplot(entry, aes(x = osVersion, fill = smartphoneType)) +
  geom_bar(position = "dodge") +
  labs(title = "Operating Systems by Smartphone Type",
       x = "OS Version",
       y = "Count",
       fill = "Smartphone Type") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Now that we’ve established our participants’ mobile computing environment, let’s take a look at the other variables contained in our entry survey. For this we will rely on a function from our AppKit helpers file.

meta_data_entry  <- explore_vars(entry)
## 
## ── Variable Explorer ─────────────────────────────────────────────────────────── 
## 
## personalParticipantCode: PersonalParticipantCode 
##   Unique values:  4wbtjyiq, x6mk28vz, odsv4l6h, s4sm1bs4, c1nn4w08, ... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## generalLoginCode: GeneralLoginCode 
##   Unique values:  EMOTIONALITY 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## committed: Committed 
##   Unique values:  10-Mar-2025 09:49:56, 10-Mar-2025 12:06:17, 09-Mar... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## scheduled: Scheduled 
##   Unique values:  10-Mar-2025 09:27:54, 10-Mar-2025 12:03:49, 09-Mar... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## published: Published 
##   Unique values:  07-Mar-2025 12:43:15 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## osVersion: OS Version 
##   Unique values:  iOS 18.3.1, Android 15, Android 14 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## smartphoneType: SmartphoneType 
##   Unique values:  iPhone12,8, iPhone14,6, CPH2305, 2107113SG, iPhone... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim01: When I feel happiness, it is a quite type of contentment. 
##   Unique values:  Almost never, Occasionally, Almost always, Usually 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim02: When a person in a wheelchair can’t get through a door, I have strong feelings of pity. 
##   Unique values:  Almost always, Almost never, Never, Usually 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim03: I get upset easily. 
##   Unique values:  Occasionally, Almost never 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim04: When I succeed at something, my reaction is calm contentment. 
##   Unique values:  Almost always, Usually, Almost never, Occasionally 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim05: I get really happy or really unhappy. 
##   Unique values:  Usually, Almost always, Almost never, Occasionally 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim06: I am a fairly quiet person. 
##   Unique values:  Occasionally, Usually, Almost always 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim07: When I’m happy, I feel fairly energetic. 
##   Unique values:  Usually, Almost always, Always, Occasionally 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim08: Seeing a picture of some violent car accident in a newspaper makes me feel sick to my stomach. 
##   Unique values:  Almost always, Usually, Never, Almost never, Occas... 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim09: When I’m happy, I feel like I’m bursting with joy. 
##   Unique values:  Usually, Always, Almost never 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim10: I would be very upset, if I got a traffic ticket. 
##   Unique values:  Usually, Occasionally, Almost always, Almost never 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim11: Looking at beautiful scenery really doesn’t affect me much. 
##   Unique values:  Occasionally, Almost never, Never 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim12: The weather doesn’t affect my mood. 
##   Unique values:  Almost never, Occasionally, Never, Usually 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim13: Others tend to get more excited than I do. 
##   Unique values:  Almost never, Almost always, Occasionally, Usually 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim14: I am not an extremely enthusiastic individual. 
##   Unique values:  Occasionally, Almost never, Usually, Almost always 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim15: Calm and cool’ could easily describe me. 
##   Unique values:  Occasionally, Almost always, Always 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim16: When I’m feeling well it is easy for me to go from being in a good mood to being really joyful. 
##   Unique values:  Almost never, Always, Almost always, Usually 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim17: When I worry, it is so mild I hardly notice it. 
##   Unique values:  Almost never, Never, Usually 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim18: I get overly enthusiastic. 
##   Unique values:  Almost always, Occasionally, Never 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim19: My happy moods are so strong that I feel like I’m ‘in heaven’. 
##   Unique values:  Almost never, Always, Occasionally, Never 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## aim20: When something bad happens, others tend to be more unhappy than I. 
##   Unique values:  Occasionally, Almost never, Always, Almost always,... 
##   Value labels:  1 = Never, 2 = Almost never, 3 = Occasionally, 4 =... 
## ──────────────────────────────────────────────────────────────────────────────── 
## age: How old are you (in years)? 
##   Unique values:  62, 42, 36, 21, 34, 39 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## sex: Which sex was assigned to you at birth? 
##   Unique values:  Male, Female 
##   Value labels:  1 = Female, 2 = Male, 3 = Intersex 
## ──────────────────────────────────────────────────────────────────────────────── 
## gender: What is your gender? 
##   Unique values:  Female, Male 
##   Value labels:  1 = Female, 2 = Male, 3 = Another gender (e.g. non... 
## 
##  ────────────────────────────────────────────────────────────────────────────────

We can see that each variable has an assigned label and, where applicable, value labels for the response options. If you want to focus on one specific variable, you can use the object returned by the explore_vars() function.

meta_data_entry[meta_data_entry$var_name == "gender",]

Before we dive in deeper, let’s get a good overview of our sample by looking at some demographics, starting with sex and gender.


4.2 Demographics

# participant gender
table(entry$gender, useNA = "always")
## 
##                           Female                             Male 
##                                2                                4 
## Another gender (e.g. non-binary)                             <NA> 
##                                0                                1
# participant sex
table(entry$sex, useNA = "always")
## 
##   Female     Male Intersex     <NA> 
##        1        4        0        2
# Sex by Gender
table(entry$gender, entry$sex, useNA = "always")
##                                   
##                                    Female Male Intersex <NA>
##   Female                                1    0        0    1
##   Male                                  0    4        0    0
##   Another gender (e.g. non-binary)      0    0        0    0
##   <NA>                                  0    0        0    1

We can also plot them in a combined overview plot:

# Sex by Gender
ggplot(entry, aes(x = sex, fill = gender)) +
  geom_bar(position = "dodge") +
  labs(title = "Sex by Gender",
       x = "Sex",
       y = "Count",
       fill = "Gender") +
  theme_minimal()

How old are our participants? Let’s take a look at the age distribution.

summary(entry$age)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   21.00   34.50   37.50   39.00   41.25   62.00       1
#boxplot
ggplot(entry, aes(y = age)) +
  geom_boxplot() +
  labs(title = "Participant Age",
       y = "Participant Age") +
  theme_minimal()
## Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).

# histogram
ggplot(entry, aes(x = age)) + 
  geom_histogram(binwidth = 1, color = "black", fill = "steelblue") +
  labs(title = "Participant Age Distribution",
       x = "Participant Age",
       y = "Count") +
  theme_minimal()
## Warning: Removed 1 rows containing non-finite values (`stat_bin()`).

Looks like we’re covering quite a bit of ground! Maybe a bit too much.


4.3 Affect Intensity Measure (AIM)

In the entry survey, we administer the 20-item version of the Affect Intensity Measure (AIM). The AIM is a self-report measure of the intensity of emotional experience. Participants rate the intensity of their emotional experience on a 7-point Likert scale ranging from 1 (never) to 7 (always). The scale consists of 20 items; the 10 items representing low emotional intensity are reverse-coded.

First, let’s have a look at the data and extract the items into their own data frame.

# processing AIM20 scale https://labs.psychology.illinois.edu/~ediener/AIM.html
entry[,8:27] # reverse code: 1,4,6,11,12,13,14,15,17,20,
aim_data <- entry[,8:27]
aim_data <- sapply(aim_data,as.numeric)
aim_data <- as.data.frame(aim_data)
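
Note that as.numeric() applied to a factor returns the position of each answer in the factor’s levels, not the text of the answer. The conversion above therefore relies on the levels being stored in the order of the value labels. The sketch below illustrates this with the first three response options only (1 = Never, 2 = Almost never, 3 = Occasionally); the object names are made up.

```r
# First three response options, in the order of the value labels
aim_levels <- c("Never", "Almost never", "Occasionally")

answers <- c("Occasionally", "Never", NA, "Almost never")

# Levels in scale order: the codes match the value labels
ordered_by_scale <- factor(answers, levels = aim_levels)
as.numeric(ordered_by_scale)

# Levels left in default (alphabetical) order: the codes no longer match
ordered_alphabetically <- factor(answers)
as.numeric(ordered_alphabetically)
```

A quick levels(entry$aim01) is a cheap way to confirm the order before converting.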

Let’s check which questions need to be reverse coded.

# printing question labels of AIM items
labels <- sapply(entry[,8:27], function(x) attr(x, "label"))
names(labels) <- NULL
labels
##  [1] "When I feel happiness, it is a quite type of contentment."                                      
##  [2] "When a person in a wheelchair can’t get through a door, I have strong feelings of pity."        
##  [3] "I get upset easily."                                                                            
##  [4] "When I succeed at something, my reaction is calm contentment."                                  
##  [5] "I get really happy or really unhappy."                                                          
##  [6] "I am a fairly quiet person."                                                                    
##  [7] "When I’m happy, I feel fairly energetic."                                                       
##  [8] "Seeing a picture of some violent car accident in a newspaper makes me feel sick to my stomach." 
##  [9] "When I’m happy, I feel like I’m bursting with joy."                                             
## [10] "I would be very upset, if I got a traffic ticket."                                              
## [11] "Looking at beautiful scenery really doesn’t affect me much."                                    
## [12] "The weather doesn’t affect my mood."                                                            
## [13] "Others tend to get more excited than I do."                                                     
## [14] "I am not an extremely enthusiastic individual."                                                 
## [15] "Calm and cool’ could easily describe me."                                                       
## [16] "When I’m feeling well it is easy for me to go from being in a good mood to being really joyful."
## [17] "When I worry, it is so mild I hardly notice it."                                                
## [18] "I get overly enthusiastic."                                                                     
## [19] "My happy moods are so strong that I feel like I’m ‘in heaven’."                                 
## [20] "When something bad happens, others tend to be more unhappy than I."
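
The question texts printed above are read from a "label" attribute on each column. As a small illustration with made-up data, the sketch below sets and reads such an attribute and shows that plain subsetting of a vector drops it, which is worth keeping in mind when filtering data.

```r
# Made-up column carrying a question label as an attribute
demo <- data.frame(aim03 = c(3, 2, 3))
attr(demo$aim03, "label") <- "I get upset easily."

# Read the label back, as done for the AIM items above
sapply(demo, function(x) attr(x, "label"))

# Subsetting a plain vector drops the attribute
subset_values <- demo$aim03[1:2]
is.null(attr(subset_values, "label"))
```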

As we can see from the item wording, we need to reverse-code the following items: 1, 4, 6, 11, 12, 13, 14, 15, 17, 20.

# reverse-coding items
aim_data[,1] <- reverse.code(keys = -1, items = as.matrix(aim_data[,1]), mini = 1, maxi = 7)
aim_data[,4] <- reverse.code(keys = -1, items = as.matrix(aim_data[,4]), mini = 1, maxi = 7)
aim_data[,6] <- reverse.code(keys = -1, items = as.matrix(aim_data[,6]), mini = 1, maxi = 7)
aim_data[,11] <- reverse.code(keys = -1, items = as.matrix(aim_data[,11]), mini = 1, maxi = 7)
aim_data[,12] <- reverse.code(keys = -1, items = as.matrix(aim_data[,12]), mini = 1, maxi = 7)
aim_data[,13] <- reverse.code(keys = -1, items = as.matrix(aim_data[,13]), mini = 1, maxi = 7)
aim_data[,14] <- reverse.code(keys = -1, items = as.matrix(aim_data[,14]), mini = 1, maxi = 7)
aim_data[,15] <- reverse.code(keys = -1, items = as.matrix(aim_data[,15]), mini = 1, maxi = 7)
aim_data[,17] <- reverse.code(keys = -1, items = as.matrix(aim_data[,17]), mini = 1, maxi = 7)
aim_data[,20] <- reverse.code(keys = -1, items = as.matrix(aim_data[,20]), mini = 1, maxi = 7)

# resetting column names
colnames(aim_data) <- colnames(entry[,8:27])
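
Reverse-coding on a scale from mini to maxi amounts to computing (mini + maxi) - x, so a 1 becomes a 7 and a 7 becomes a 1 on the scale used here. As an alternative sketch in base R (reverse_item() and demo_items are illustrative names and made-up values, not part of the helpers), several columns can be reversed in one step:

```r
# Reverse-code a numeric item on a scale from `mini` to `maxi`
reverse_item <- function(x, mini = 1, maxi = 7) {
  (mini + maxi) - x
}

# Made-up responses; aim01 and aim04 are among the reverse-coded items
demo_items <- data.frame(aim01 = c(1, 4, 7, NA),
                         aim02 = c(2, 2, 6, 5),
                         aim04 = c(7, 3, NA, 1))

reverse_cols <- c("aim01", "aim04")
demo_items[reverse_cols] <- lapply(demo_items[reverse_cols], reverse_item)
demo_items
```

Missing values stay missing, and the column names are preserved, so no renaming step is needed afterwards.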

Now that we have reverse-coded the correct items, we can calculate the overall AIM score (the mean across all 20 items) and attach it as its own variable to the entry survey data frame. Participants with a missing response on any item receive NA.

# summary scores
aim_score <- rowSums(aim_data)/20

# attaching to dataframe
entry$aim_score <- aim_score
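
Because rowSums() returns NA as soon as one item is missing, several participants end up without a score. One common alternative, sketched here with made-up data and an arbitrary threshold, is to average the answered items when enough of them are available. Whether this is appropriate for the AIM is a substantive decision that this notebook does not make.

```r
# Made-up responses to four items
demo_items <- data.frame(i1 = c(1, 7, 4),
                         i2 = c(2, NA, 4),
                         i3 = c(3, 5, NA),
                         i4 = c(2, 6, NA))

# Strict scoring as used above: NA if any item is missing
strict_score <- rowSums(demo_items) / ncol(demo_items)

# Lenient scoring: mean of answered items if at least 3 of 4 were answered
# (the threshold of 3 is an arbitrary choice for this illustration)
n_answered <- rowSums(!is.na(demo_items))
lenient_score <- ifelse(n_answered >= 3,
                        rowMeans(demo_items, na.rm = TRUE),
                        NA_real_)

data.frame(strict_score, lenient_score, n_answered)
```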

Now that we have an overall score, we can look at some basic plots and group differences.

Note: Please keep in mind that the test data set we’re working with has fewer than 10 respondents and only serves to illustrate the processing steps. The resulting plots and statistics will thus look a bit wonky and are in no way scientifically valid.


4.3.1 AIM by Gender

# Aim score by gender
by(entry$aim_score, entry$gender, summary)
## entry$gender: Female
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    5.15    5.15    5.15    5.15    5.15    5.15       1 
## ------------------------------------------------------------ 
## entry$gender: Male
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   3.550   3.875   4.200   4.050   4.300   4.400       1 
## ------------------------------------------------------------ 
## entry$gender: Another gender (e.g. non-binary)
## NULL
ggplot(entry, aes(x = gender, y = aim_score, fill = gender)) +
  geom_boxplot() +
  labs(title = "Side-by-Side Boxplot of Aim Score by Gender",
       x = "Gender",
       y = "Score") +
  theme_minimal() +
  scale_fill_brewer(palette = "Pastel1")


4.3.2 AIM by Sex

# Aim score by sex
by(entry$aim_score, entry$sex, summary)
## entry$sex: Female
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    5.15    5.15    5.15    5.15    5.15    5.15 
## ------------------------------------------------------------ 
## entry$sex: Male
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   3.550   3.875   4.200   4.050   4.300   4.400       1 
## ------------------------------------------------------------ 
## entry$sex: Intersex
## NULL
ggplot(entry, aes(x = sex, y = aim_score, fill = sex)) +
  geom_boxplot() +
  labs(title = "Side-by-Side Boxplot of Aim Score by Sex",
       x = "Sex",
       y = "Score") +
  theme_minimal() +
  scale_fill_brewer(palette = "Pastel1")


4.3.3 AIM by Age

# Aim score by age
cor_test_results <- cor.test(entry$age, entry$aim_score)
print(cor_test_results)
## 
##  Pearson's product-moment correlation
## 
## data:  entry$age and entry$aim_score
## t = -0.19465, df = 1, p-value = 0.8776
## alternative hypothesis: true correlation is not equal to 0
## sample estimates:
##        cor 
## -0.1910636
ggplot(entry, aes(x = age, y = aim_score)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE, color = "blue") +
  labs(title = "Scatterplot of Age by AIM score",
       x = "Age",
       y = "AIM score") +
  theme_minimal()
## `geom_smooth()` using formula = 'y ~ x'
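
The degrees of freedom of a Pearson correlation test equal the number of complete pairs minus 2, so df = 1 in the output above means that only 3 participants had both an age and an AIM score. The sketch below shows how to check the number of complete pairs before interpreting a correlation; the values are made up.

```r
# Made-up values with missing entries in both variables
demo_age   <- c(62, 42, NA, 36, 21)
demo_score <- c(NA, 4.1, 3.9, 4.4, 5.0)

# Number of participants with both values present
n_complete <- sum(complete.cases(demo_age, demo_score))
n_complete

# cor.test() silently drops incomplete pairs; df = n_complete - 2
demo_test <- cor.test(demo_age, demo_score)
demo_test$parameter
```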





5 Momentary Assessment Survey


5.1 Data Quality

With our insights from the entry survey in mind, we can move on to the “main part” of our study, the momentary assessment survey. We will start by looking at the variables it contains, again using the explore_vars() function from our AppKit helpers file.

explore_vars(ma)
## 
## ── Variable Explorer ─────────────────────────────────────────────────────────── 
## 
## personalParticipantCode: PersonalParticipantCode 
##   Unique values:  c1nn4w08, x6mk28vz, d0n63h9a, 67wibffe, s4sm1bs4, ... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## generalLoginCode: GeneralLoginCode 
##   Unique values:  EMOTIONALITY 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## committed: Committed 
##   Unique values:  11-Mar-2025 20:05:48, 08-Mar-2025 19:19:17, 09-Mar... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## scheduled: Scheduled 
##   Unique values:  11-Mar-2025 15:33:00, 08-Mar-2025 19:19:00, 09-Mar... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## published: Published 
##   Unique values:  07-Mar-2025 12:43:04 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## osVersion: OS Version 
##   Unique values:  Android 15, iOS 18.3.1, Android 14 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## smartphoneType: SmartphoneType 
##   Unique values:  CPH2305, iPhone12,8, 2107113SG, iPhone14,4, iPhone... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## screening: Welcome (back) for reporting on a recent social interaction!Have you been part of any social interaction in the last two hours?With social interactions we refer to those situations, in which individuals or groups react to each other, interact with one another, influence each other, and guide each other' 
##   Unique values:  FALSE, TRUE 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## description: Please use the interaction you have just thought of as a reference for answering the following questions.&nbsp;Can you briefly describe the type of social interaction? This can make answering the following questions much easier. 
##   Unique values:  , Neuen Kollegen kennengelernt, Geredet über alles... 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## delay: How many minutes ago did this interaction take place? 
##   Unique values:  Less than 30 minutes ago, More than 30 but less th... 
##   Value labels:  1 = Less than 30 minutes ago, 2 = More than 30 but... 
## ──────────────────────────────────────────────────────────────────────────────── 
## duration: How long did the interaction last?For longer interactions (e.g. when you spend a whole day with friends), you can focus on the part that evoked the strongest emotional response. 
##   Unique values:  More than 5 but less than 15 minutes, More than 30... 
##   Value labels:  1 = Less than 5 minutes, 2 = More than 5 but less ... 
## ──────────────────────────────────────────────────────────────────────────────── 
## groupsize: How many people took part in the interaction (you included)? 
##   Unique values:  2, 4, 6, 7, 35, 5, 3, 20, 25, 35000 ... (11 total) 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## partner_gender: What gender do you ascribe to your (main) interaction partner? 
##   Unique values:  Male, Female, Other 
##   Value labels:  1 = Female, 2 = Male, 3 = Non-binary, 4 = Other 
## ──────────────────────────────────────────────────────────────────────────────── 
## partner_close: Please indicate how close you consider the relationship between you and your (main) interaction partner to be. 
##   Unique values:  1, 7, 3, 6, 2, 4, 5 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## interaction_quality: Please indicate your perceived quality of the interaction. 
##   Unique values:  4, 5, 1, 3, 2 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## interaction_intimacy: Please indicate the intimacy of the interaction. 
##   Unique values:  3, 4, 2, 1, 5 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## interaction_control: Please indicate your perceived degree of control over the interaction in comparison to your main interaction partner. 
##   Unique values:  3, 5, 2, 4, 1 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## interaction_intensity: Please indicate the overall intensity of your emotional experience during the interaction. 
##   Unique values:  5, 3, 2, 1, 4 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_happy: Please indicate how happy you felt during the interaction. 
##   Unique values:  4, 3, 1, 2, 5 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_sad: Please indicate how sad you felt during the interaction. 
##   Unique values:  1, 2, 4, 3 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_nervous: Please indicate how nervous you felt during the interaction. 
##   Unique values:  4, 1, 2, 5, 3 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_surprise: Please indicate how surprised you felt during the interaction. 
##   Unique values:  3, 2, 4, 1, 5 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_angry: Please indicate how angry you felt during the interaction. 
##   Unique values:  1, 5, 3, 2, 4 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_embarrassed: Please indicate how embarrassed you felt during the interaction. 
##   Unique values:  1, 2, 4, 3 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## emo_ashamed: Please indicate how ashamed you felt during the interaction. 
##   Unique values:  1, 2, 4, 3 
##   Value labels:  <none> 
## ──────────────────────────────────────────────────────────────────────────────── 
## dataquality: Thank you very much for supporting our research.For ensuring data quality, please indicate whether you answered the questions within this survey conscientiously: 
##   Unique values:  I have answered the questions conscientiously, I d... 
##   Value labels:  1 = I have answered the questions conscientiously,... 
## 
##  ────────────────────────────────────────────────────────────────────────────────

As you can see, participants were asked about their interactions with others throughout the day and how they felt about them.

You can also see that the data structure is different from the entry survey. While the entry dataframe had one row per participant, the momentary assessment dataframe has multiple rows per participant. Let's check this:

table(ma$personalParticipantCode, useNA = "always")
## 
## 67wibffe c1nn4w08 d0n63h9a odsv4l6h s4sm1bs4 u1hjr78d x6mk28vz     <NA> 
##       25       32       32        9       32        1       27        0

We can see that we have a different number of rows per participant, one for each answered survey.

Before we begin with a more detailed analysis of the survey contents, let’s assess how many of our participants actually completed the surveys. We know that each participant received 8 surveys per day over a period of 4 days, totaling 32 surveys. We can use the personalParticipantCode variable to group the data by participant and then count the number of surveys each participant completed.

# Create a table of scheduled and committed surveys per participant
ma %>%
  group_by(personalParticipantCode) %>%
  summarise(
    scheduled = 4*8,
    committed = sum(!is.na(committed))
  ) %>%
  ungroup() %>%
  mutate(
    completion_rate = committed / scheduled * 100
  ) %>%
  arrange(desc(completion_rate)) %>%
  print(n = Inf)
## # A tibble: 7 × 4
##   personalParticipantCode scheduled committed completion_rate
##   <chr>                       <dbl>     <int>           <dbl>
## 1 c1nn4w08                       32        32          100   
## 2 d0n63h9a                       32        32          100   
## 3 s4sm1bs4                       32        32          100   
## 4 x6mk28vz                       32        27           84.4 
## 5 67wibffe                       32        25           78.1 
## 6 odsv4l6h                       32         9           28.1 
## 7 u1hjr78d                       32         1            3.12
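
One caveat: if the export only contains answered surveys, a participant who never responded has no rows in the dataframe and would be missing from a table like the one above. The following is a minimal base R sketch of how zero counts could be kept. The vector `enrolled` and the toy dataframe `ma_toy` are hypothetical stand-ins and not part of the study data.

```r
# Hypothetical list of all enrolled participants, including one without any rows
enrolled <- c("p1", "p2", "p3")

# Toy momentary data: one row per answered survey
ma_toy <- data.frame(personalParticipantCode = c("p1", "p1", "p1", "p2"))

# A factor with all enrolled codes as levels keeps participants with zero rows
answered <- table(factor(ma_toy$personalParticipantCode, levels = enrolled))

completion <- data.frame(
  personalParticipantCode = names(answered),
  scheduled = 4 * 8,
  committed = as.integer(answered)
)
completion$completion_rate <- completion$committed / completion$scheduled * 100
completion
```

In the notebook itself, `enrolled` could for example be taken from the participant codes in the entry survey, assuming every participant completed it.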

We can also visualize the number of completed surveys per day to check whether compliance was constant or decreased over time. Keep in mind, though, that not all participants started their 4-day survey window on the same day!

ma %>%
  mutate(scheduled = dmy_hms(scheduled),
         scheduled_day = as_date(scheduled)) %>%
  count(scheduled_day, personalParticipantCode) %>%
  ggplot(aes(x = scheduled_day, y = n)) +
  geom_line(linewidth = 1, color = "steelblue") +
  geom_point(size = 2, color = "steelblue") +
  facet_wrap(~ personalParticipantCode, ncol = 1) +
  scale_y_continuous(limits = c(0, 10), breaks = 0:10) +
  scale_x_date(date_breaks = "1 day", date_labels = "%d %b") +
  labs(title = "Entries Over Time (Per Participant)",
       x = "Scheduled Date",
       y = "Number of Entries") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
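
Because participants started on different calendar dates, it can help to express each survey as a study day relative to the participant's own first day. The following is a minimal base R sketch with toy data. The participant codes and timestamps are made up, and the first day observed in the data is assumed to stand in for the start of the survey window.

```r
# Toy data: hypothetical participant codes and scheduled timestamps
toy <- data.frame(
  personalParticipantCode = c("p1", "p1", "p1", "p2", "p2"),
  scheduled = as.POSIXct(
    c("2025-03-01 09:00:00", "2025-03-01 15:30:00", "2025-03-02 10:00:00",
      "2025-03-03 11:00:00", "2025-03-06 18:45:00"),
    tz = "UTC"
  )
)

# Calendar date of each scheduled survey
toy$scheduled_day <- as.Date(toy$scheduled, tz = "UTC")

# First observed date per participant, repeated for each of their rows
first_day <- ave(as.numeric(toy$scheduled_day),
                 toy$personalParticipantCode,
                 FUN = min)

# Study day: 1 on the participant's first day, 2 on the next, and so on
toy$study_day <- as.numeric(toy$scheduled_day) - first_day + 1

# Number of surveys per participant and study day
table(toy$personalParticipantCode, toy$study_day)
```

Plotting counts against `study_day` instead of the calendar date would align all participants on a common time axis.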

Since the surveys were sent out at random, let’s check when the surveys were sent and how long it took participants to respond.

First, let’s visualize when the surveys were sent out. We will use the scheduled variable to plot the distribution of survey times.

# Convert the timestamp variables to POSIXct
# Note: %b matches abbreviated month names in the current locale
ma <- ma %>%
  mutate(
    scheduled = as.POSIXct(scheduled, format = "%d-%b-%Y %H:%M:%S"),
    committed = as.POSIXct(committed, format = "%d-%b-%Y %H:%M:%S"),
    published = as.POSIXct(published, format = "%d-%b-%Y %H:%M:%S"))


# Create a histogram of the scheduled times
ggplot(ma, aes(x = scheduled)) +
  geom_histogram(binwidth = 3600, fill = "dodgerblue", color = "black") +
  labs(title = "Distribution of Scheduled Times",
       x = "Scheduled Time",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Create a histogram of the committed times
ggplot(ma, aes(x = committed)) +
  geom_histogram(binwidth = 3600, fill = "forestgreen", color = "black") +
  labs(title = "Distribution of Committed Times",
       x = "Committed Time",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

The distributions of scheduled and committed times look fairly similar, but these plots do not show how individual surveys line up. A plot that links each scheduled time to its committed time gives a clearer picture.

# Define a small vertical offset.
offset <- 0.2

# Map each participant code to a numeric index (used for the y-axis labels)
ma %>%
  mutate(
    personalParticipantCode = factor(personalParticipantCode),
    ppt_index = as.numeric(personalParticipantCode)
  ) %>%
  distinct(personalParticipantCode, ppt_index) -> ppt_map

ma %>%
  mutate(
    personalParticipantCode = factor(personalParticipantCode),
    ppt_index = as.numeric(personalParticipantCode)
  ) %>%
  ggplot() +
  geom_segment(aes(
    x = scheduled, xend = committed,
    y = ppt_index - offset, yend = ppt_index + offset
  ), color = "grey", linewidth = 1) +
  geom_point(aes(x = scheduled, y = ppt_index - offset),
             color = "blue", size = 3) +
  geom_point(aes(x = committed, y = ppt_index + offset),
             color = "green", size = 3) +
  scale_y_continuous(
    breaks = ppt_map$ppt_index,
    labels = ppt_map$personalParticipantCode
  ) +
  scale_x_datetime(
    date_breaks = "1 day",
    date_labels = "%d %b"
  ) +
  labs(
    title = "Scheduled and Committed Beeps per Participant",
    x = "Date",
    y = "Participant ID",
    caption = "\U1F535 Scheduled    \U1F7E2 Committed"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    axis.text.y = element_text(size = 8)
  )

We can also calculate the time between the scheduled and committed times. This will give us a pretty good idea of how long it took participants to respond to the surveys on average.

# Calculate the time difference in seconds
ma <- ma %>%
  mutate(
    time_diff = as.numeric(difftime(committed, scheduled, units = "secs"))
  )

ma %>%
  mutate(time_diff_min = time_diff / 60) %>%
  mutate(time_diff_hours = time_diff_min / 60) %>%
  ggplot(aes(x = personalParticipantCode, y = time_diff_hours)) +
  geom_boxplot(fill = "skyblue", outlier.color = "red") +
  labs(
    title = "Time Difference Distributions per Participant",
    x = "Participant Code",
    y = "Time Difference (h)"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

On average, it took participants 3.02 hours to respond to the surveys. The median time difference was 0.52 hours, the maximum was 22.44 hours, and the fastest response took 8 seconds.
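
Summary figures like these can be computed directly from the time difference in seconds. The following is a minimal sketch with made-up values. In the notebook, the vector `time_diff` would be replaced by `ma$time_diff`.

```r
# Toy response latencies in seconds (hypothetical values, not the study data)
time_diff <- c(8, 95, 600, 1800, 2400, 7200, 36000, NA)

# Convert to hours so the numbers match the unit used in the boxplot
time_diff_hours <- time_diff / 3600

latency_summary <- c(
  mean_h   = mean(time_diff_hours, na.rm = TRUE),
  median_h = median(time_diff_hours, na.rm = TRUE),
  max_h    = max(time_diff_hours, na.rm = TRUE),
  min_s    = min(time_diff, na.rm = TRUE)
)
round(latency_summary, 2)
```

A large gap between mean and median, as in the study data, points to a right-skewed distribution in which a few very late responses pull the mean upwards.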

We can also check the self-reported delay between the social interaction and answering the survey.

# delay
#table(ma_subset$personalParticipantCode, ma_subset$delay)
#round(prop.table(table(ma_subset$personalParticipantCode,
#ma_subset$delay),margin = 1),2)

ggplot(ma, aes(x = personalParticipantCode, fill = delay)) +
  geom_bar() +
  labs(title = "Stacked Bar Graph of Participant Code and Delay",
       x = "Participant Code",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Yet, not all committed responses include a personal interaction, and not all answers were given conscientiously. We can check how many interactions we can analyze by contrasting the dataquality variable, which indicates whether the response was given conscientiously, with the screening variable, which indicates whether the participant had an interaction with someone else.

# Create a table of data quality and screening
ma %>%
  mutate(dataquality = ifelse(as.integer(dataquality) == 1, "Conscientious",
                              "Not Conscientious")) %>%
  mutate(screening = ifelse(as.integer(screening) == 1, "Interaction",
                            "No Interaction")) %>%
  select(dataquality, screening) %>%
  table(useNA = "always")
##                    screening
## dataquality         Interaction No Interaction <NA>
##   Conscientious              58              0    0
##   Not Conscientious           2              0    0
##   <NA>                        0             98    0

As a last check, we can explore during what time of the day participants were most likely to respond to the surveys. We can do this by extracting the hour from the committed variable and plotting the distribution of responses.

# reformatting
ma3 <- ma %>%
  mutate(
    committed_hour = factor(format(committed, "%H"), levels = sprintf("%02d", 0:23))
  )

# plotting
ggplot(ma3, aes(x = committed_hour)) +
  geom_bar(fill = "dodgerblue", color = "black") +
  labs(
    title = "Distribution of Committed Hours",
    x = "Hour of Day",
    y = "Count"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Let’s check the emotionality of interactions from the ma dataframe.


5.2 Removing Data

First, let’s reduce the dataframe to the instances where people actually reported on a social interaction and where they indicated that they answered the questions conscientiously.

# subsetting dataframe
ma_subset <- ma %>%
  filter(screening == TRUE)
# How many beeps were answered conscientiously
#table(ma_subset$dataquality)
#table(ma_subset$dataquality, ma_subset$personalParticipantCode)

ggplot(ma_subset, aes(x = personalParticipantCode, fill = dataquality)) +
  geom_bar(position = "dodge") +
  labs(title = "Conscientious Answers by Participant",
       x = "Participant Code",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "bottom")

# remove non-conscientious answers
ma_subset <- ma_subset[ma_subset$dataquality == "I have answered the questions conscientiously",]
# assigning correct NAs to description
ma_subset$description[nchar(ma_subset$description) <= 3] <- NA
#ma_subset$description[!is.na(ma_subset$description)]


5.3 Interactions

Let’s first have a look at what kind of interactions participants report on with respect to group size and interaction partner gender.

# partner gender
#table(ma_subset$personalParticipantCode, ma_subset$partner_gender)
#round(prop.table(table(ma_subset$personalParticipantCode, ma_subset$partner_gender),margin = 1),2)

ggplot(ma_subset, aes(x = personalParticipantCode, fill = partner_gender)) +
  geom_bar() +
  labs(title = "Stacked Bar Graph of Participant Code and Partner Gender",
       x = "Participant Code",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Group size
ma_subset %>%
  ggplot(aes(x = personalParticipantCode, y = groupsize)) +
  geom_boxplot(fill = "skyblue", outlier.color = "red") +
  labs(
    title = "Group size Distributions per Participant",
    x = "Participant Code",
    y = "Group Size"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Let’s check the outliers:

ma_subset$description[ma_subset$groupsize > 1000]
## [1] "stadion"
# Group size (restricted to groups of at most 30 people)
ma_subset[ma_subset$groupsize <= 30,] %>%
  ggplot(aes(x = personalParticipantCode, y = groupsize)) +
  geom_boxplot(fill = "skyblue", outlier.color = "red") +
  labs(
    title = "Group size Distributions per Participant (Group Size <= 30)",
    x = "Participant Code",
    y = "Group Size"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Let’s also check the reported duration of interactions:

# duration
#table(ma_subset$personalParticipantCode, ma_subset$duration)
#round(prop.table(table(ma_subset$personalParticipantCode, ma_subset$duration),margin = 1),2)

ggplot(ma_subset, aes(x = personalParticipantCode, fill = duration)) +
  geom_bar() +
  labs(title = "Stacked Bar Graph of Participant Code and duration",
       x = "Participant Code",
       y = "Count") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))


5.4 Emotionality

Before we can look at the reported emotional impact of the interactions, we still have to recode non-responses as missing values.

# formatting non-responses
for (i in seq_along(ma_subset[,14:25])) {
  ma_subset[,14:25][i][ma_subset[,14:25][i] < 0] <- NA
}
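
Selecting columns by position breaks silently if the column order of the export changes. The following is a minimal base R sketch of a name-based alternative using toy data. It assumes that non-responses are coded as negative numbers (such as -1) and that the rating columns can be identified by the prefixes `interaction_` and `emo_`. Columns that do not share these prefixes, such as partner_close, would have to be added by name.

```r
# Toy data: negative codes stand in for non-responses (hypothetical values)
toy <- data.frame(
  personalParticipantCode = c("p1", "p1", "p2"),
  interaction_quality     = c(4, -1, 3),
  emo_happy               = c(5, 2, -1),
  emo_sad                 = c(-1, 1, NA)
)

# Select rating columns by name prefix rather than by position
rating_cols <- grep("^(interaction_|emo_)", names(toy), value = TRUE)

# Replace negative codes with NA, leaving existing NAs untouched
toy[rating_cols] <- lapply(toy[rating_cols], function(x) {
  x[!is.na(x) & x < 0] <- NA
  x
})

toy
```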

Finally, let’s check the emotionality of reported social interactions!

# Interaction Features
summary_by_ppt <- ma_subset %>%
  pivot_longer(
    cols = c(interaction_quality,
             interaction_intimacy,
             interaction_control,
             interaction_intensity),
    names_to = "measure",
    values_to = "value"
  ) %>%
  group_by(personalParticipantCode, measure) %>%
  summarise(summary_stats = list(summary(value)), .groups = "drop") %>%
  unnest_wider(summary_stats)
#print(summary_by_ppt)

# All ppts
ma_subset %>%
  pivot_longer(
    cols = c(interaction_quality, interaction_intimacy, interaction_control, interaction_intensity),
    names_to = "measure",
    values_to = "value"
  ) %>%
  ggplot(aes(x = measure, y = value, fill = measure)) +
  geom_boxplot() +
  labs(title = "Distribution of Interaction Measures (All Participants)",
       x = "Measure",
       y = "Value") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set2") +
  theme(legend.position = "none")

# Emotionality Features
summary_by_ppt <- ma_subset %>%
  pivot_longer(
    cols = c(emo_happy,
             emo_sad,
             emo_nervous,
             emo_surprise,
             emo_angry,
             emo_embarrassed,
             emo_ashamed),
    names_to = "measure",
    values_to = "value"
  ) %>%
  group_by(personalParticipantCode, measure) %>%
  summarise(summary_stats = list(summary(value)), .groups = "drop") %>%
  unnest_wider(summary_stats)
#print(summary_by_ppt)

# All ppts
ma_subset %>%
  pivot_longer(
    cols = c(emo_happy,
             emo_sad,
             emo_nervous,
             emo_surprise,
             emo_angry,
             emo_embarrassed,
             emo_ashamed),
    names_to = "measure",
    values_to = "value"
  ) %>%
  ggplot(aes(x = measure, y = value, fill = measure)) +
  geom_boxplot() +
  labs(title = "Distribution of Emotion Measures (All Participants)",
       x = "Emotion Measure",
       y = "Value") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set3") +
  theme(legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1))





6 Answering our research question

The question we wanted to investigate and replicate was based on this finding:

We predicted and found that sex-related differences in emotion in global self-descriptions, but not in the averaged momentary ratings of emotion.

After having explored both the global ratings of emotional intensity and the in-situ ratings of emotional intensity for social interactions, we can look at them in combination. To do so, we first need to combine the information from the entry survey (gender, AIM score) with the information from the momentary assessment (interaction_intensity).

# We can combine the information of both dataframes into one dataframe
# to enable more sophisticated analyses. To do so, we need to match the
# participant IDs and join the values from the entry dataframe with the
# ESM beeps.
#colnames(entry) # take over: committed, scheduled, published,
#                # aim_score, fill_in_time, sex, gender, age
#colnames(ma)    # keep all

# join entry survey
emotionality_study <- ma %>%
  left_join(entry %>% select(personalParticipantCode,
                             committed,
                             scheduled,
                             published,
                             aim_score,
                             gender,
                             sex,
                             age), by = "personalParticipantCode")

# fixing same column names
# left_join() adds the suffix .x (ma) and .y (entry) to duplicated names;
# renaming by name is more robust than renaming by column position
#colnames(emotionality_study)
emotionality_study <- emotionality_study %>%
  rename(
    committed_entry_survey = committed.y,
    scheduled_entry_survey = scheduled.y,
    published_entry_survey = published.y
  )
# fill_in_time is not part of the select() above, so there is nothing to rename
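
After a join like this, it is worth checking that no rows were duplicated and that every participant found a match in the entry survey. The following is a minimal sketch with toy dataframes. The names `ma_toy` and `entry_toy` and all values are hypothetical.

```r
library(dplyr)

# Toy stand-ins for the two dataframes (hypothetical codes and values)
ma_toy <- tibble(
  personalParticipantCode = c("p1", "p1", "p2", "p3"),
  interaction_intensity   = c(3, 5, 2, 4)
)
entry_toy <- tibble(
  personalParticipantCode = c("p1", "p2"),
  aim_score               = c(4.2, 5.1),
  gender                  = c("Male", "Female")
)

joined <- left_join(ma_toy, entry_toy, by = "personalParticipantCode")

# A left join on a unique key must not change the number of rows
stopifnot(nrow(joined) == nrow(ma_toy))

# The key must be unique in the entry survey, otherwise rows get duplicated
stopifnot(!any(duplicated(entry_toy$personalParticipantCode)))

# Participants with momentary data but no entry survey end up with NA values
anti_join(ma_toy, entry_toy, by = "personalParticipantCode") %>%
  distinct(personalParticipantCode)
```

Participants listed by the last call would carry NA in all entry survey variables, which is one possible source of missing values in later summaries.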


6.1 Global Emotional Intensity

# Aim score by gender
by(emotionality_study$aim_score, emotionality_study$gender, summary)
## emotionality_study$gender: Female
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    5.15    5.15    5.15    5.15    5.15    5.15      27 
## ------------------------------------------------------------ 
## emotionality_study$gender: Male
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   3.550   3.550   4.200   4.051   4.400   4.400      32 
## ------------------------------------------------------------ 
## emotionality_study$gender: Another gender (e.g. non-binary)
## NULL
ggplot(emotionality_study, aes(x = gender,
                               y = aim_score,
                               fill = gender)) +
  geom_boxplot() +
  labs(title = "Side-by-Side Boxplot of Aim Score by Gender",
       x = "Gender",
       y = "Score") +
  theme_minimal() +
  scale_fill_brewer(palette = "Pastel1")


6.2 In-situ Emotional Intensity

Let’s start with the most basic comparison:

# overall diff plot
ggplot(emotionality_study, aes(x = gender, y = as.numeric(interaction_intensity), fill = gender)) +
  geom_boxplot() +
  scale_fill_manual(values = c("Female" = "lightcoral", "Male" = "lightblue")) +
  labs(
    title = "Interaction Intensity by Gender",
    x = "Gender",
    y = "Interaction Intensity"
  ) +
  theme_minimal()

We have a nested structure of in-situ measurements with surveys nested in days, days nested in participants and participants nested in gender.

To compare male and female participants we need to aggregate the in-situ measurements to the participant level. We can do this by calculating the mean (or median) of the interaction_intensity for each participant and then averaging over the averages to generate overall scores.

Note: The better option for analyzing data like these is a true multilevel modelling approach. However, such models do not make sense with our N < 10 toy example. To learn about them, feel free to check out our upcoming 2-day GESIS workshop.

Aggregating in-situ measurements

participant_summary <- emotionality_study %>%
  filter(!is.na(interaction_intensity), interaction_intensity != -1) %>%
  group_by(personalParticipantCode) %>%
  summarise(
    mean_interaction_intensity = mean(as.numeric(interaction_intensity), na.rm = TRUE),
    gender = first(gender),
    aim_score = first(aim_score)
  ) %>%
  ungroup()

participant_summary
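
The second step, averaging over the participant averages, can be sketched as follows. The toy values and the column `n_beeps` are hypothetical. They are chosen to show why the average of participant means can differ from a pooled mean across all beeps.

```r
# Toy participant-level means (hypothetical values, not the study data)
participant_toy <- data.frame(
  personalParticipantCode    = c("p1", "p2", "p3", "p4"),
  gender                     = c("Female", "Female", "Male", "Male"),
  n_beeps                    = c(30, 2, 10, 10),
  mean_interaction_intensity = c(4.0, 2.0, 3.0, 3.5)
)

# Average of participant means: every participant counts equally
gender_means <- tapply(
  participant_toy$mean_interaction_intensity,
  participant_toy$gender,
  mean
)

# Pooled mean for comparison: participants with more beeps weigh more
pooled_means <- sapply(
  split(participant_toy, participant_toy$gender),
  function(d) weighted.mean(d$mean_interaction_intensity, d$n_beeps)
)

round(rbind(gender_means, pooled_means), 2)
```

In this toy example, the two approaches agree for the male participants, who answered equally often, but diverge for the female participants, where one person contributed most of the beeps.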

Let’s check the overall correlation between the AIM score and the in-situ interaction intensity:

ggplot(participant_summary, aes(x = aim_score, y = mean_interaction_intensity)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "darkgray") +
  stat_cor(method = "pearson", label.x = 3, label.y = 4.5) +
  labs(
    title = "Correlation between AIM Score and Avg. Interaction Intensity",
    x = "AIM Score",
    y = "Mean Interaction Intensity"
  ) +
  theme_minimal()

And finally, let’s check the gender differences based on the aggregated in-situ measurements:

ggplot(participant_summary, aes(x = gender, y = mean_interaction_intensity, fill = gender)) +
  geom_boxplot() +
  scale_fill_manual(values = c("Female" = "lightcoral", "Male" = "lightblue")) +
  labs(
    title = "Average Interaction Intensity by Gender",
    x = "Gender",
    y = "Mean Interaction Intensity"
  ) +
  theme_minimal()

6.3 Other resources

  • The ESM Preprocessing Gallery, maintained by Jordan Revol and colleagues at KU Leuven, is a great resource for code snippets for working with momentary assessment data.